1. INTRODUCTION

Some nations, such as Korea, Japan, Thailand, and Indonesia, have their own traditional scripts. Indonesia
has more than one traditional script because it consists of various regional tribes, each with its own
culture, and some with their own letters. For example, the Javanese tribe has the Javanese script, the
Sunda tribe has aksara Sunda [1], Lampung has aksara Kaganga [2], Lombok island has aksara Sasak [3],
Makasar has aksara Lontara [4], and the Batak tribe has aksara Batak [5]. The Javanese script consists of
20 letters that come from the legend of Ajisaka. In the legend, there is a fight between two servants of
Ajisaka, namely Dora and Sembada. Both of them died because they were equally strong. To commemorate his
two servants, Ajisaka composed their story as a series of letters known as the Javanese script or
Hanacaraka letters [6]. Handwritten Javanese characters are an exciting topic to study, both for scientific
purposes and to preserve Indonesian culture. The scope of the research is also vast, including image
processing of Javanese documents, applications that convert Javanese-language text input into Hanacaraka
text and vice versa, Javanese character image classification, and many more.

Image classification has some general stages: preprocessing, feature extraction and selection, and
classification [7]. There are various methods for the preprocessing step on handwritten characters, such
as denoising [8]-[13], dilation [14], [15], binarization [16], [17], and skeletonization [18]. Some studies
rely on feature extraction, such as horizontal and vertical image profiles [16], [19], the zoning method
[11], [18], [20], the histogram of oriented gradient (HOG) feature [18], mesh and local line direction
(LLD) [17], and the fast Fourier transform (FFT) [19]. The artificial neural network (ANN) is a powerful
machine learning method that is useful for classification, clustering, pattern recognition, and
prediction [17]. ANNs trained with backpropagation have been applied to Javanese character problems with
good performance [10], [15], [21], [22].

Several previous studies focused on handwritten recognition of Javanese script. Fauziah et al. [14]
obtained 87.5% accuracy using dilation, the Otsu method, Canny edge detection, and contours as
preprocessing with a convolutional neural network (CNN) classifier. Rismiyati et al. [18] obtained 88.45%
accuracy using the HOG feature and a support vector machine (SVM). Another study by Rismiyati et al. [23]
did not use any preprocessing technique except grayscaling. They used a deep neural network (DNN) and a
CNN for the recognition step, and the experiment obtained 64.65% and 70.22% accuracy, respectively. Budhi
and Adipranata [24] used grayscaling and prefiltering to reduce noise before segmentation. That study used
image centroid zone-zone centroid zone (ICZ-ZCZ) feature extraction and three classification engines, with
accuracies of 3.17% for the counterpropagation network (CPN), 58.12% for the one-layer evolutionary neural
network (ENN), and 59.31% for the two-layer ENN. Widiarti et al. [25] used slanting, lowpass filtering, and
thinning as preprocessing with a similarity classifier and obtained an average accuracy of 79.6%. Wibowo
et al. [26] used deep learning, which needs neither preprocessing nor feature extraction. Sari et al. [27]
used a median filter and dilation as preprocessing, roundness and eccentricity as features, and k-nearest
neighbor (KNN) as the classifier. The result was 87.5% accuracy, and it also showed that the two
preprocessing steps, median filter and dilation, significantly improve system accuracy when used
together. Sugianela and Suciati [28] used HOG features and a multiclass SVM classifier, which reached
81.3% accuracy, higher than random forest (RF), KNN, and ANN. Mahastama and Krisnawati [29] proposed
optical character recognition (OCR) for Javanese script using a projection profile for segmentation,
binary image features, and a nearest centroid classifier (NCC) for classification. The experiment obtained
60.6% recognition accuracy. Susanto et al. [30] used the local binary pattern (LBP) for feature extraction
with a KNN classifier and achieved 82.5% accuracy. Susanto's following research added metrics and
eccentricity features, which increased the accuracy of the KNN classifier to 92.5% [31]. In another study,
Susanto et al. [32] used a median filter and thresholding as preprocessing together with HOG features.
Using a KNN classifier at K=1, the highest accuracy obtained was 98.5%. Rasyidi et al. [33] analyzed the
effect of thinning as a preprocessing step, using HOG for feature extraction and the RF algorithm for
image recognition. The result shows that thinning did not improve accuracy. On the contrary, accuracy
decreased from 96.29% to 91.84%. Diqi et al. [34] compared CNN parameters and pooling filter sizes and did
not implement any preprocessing step. The best accuracy, 93%, was achieved with a 5x5 average pooling
filter.

This research aims to analyze the impact of preprocessing methods on a handwritten Javanese
character dataset. The preprocessing methods are dilation, skeletonization, and noise reduction. To evaluate
the effectiveness of each preprocessing method, an ANN is used for character recognition.


2. METHOD 

This research compares several preprocessing techniques to support the recognition of Javanese
handwriting. Figure 1 shows the proposed method. The processing starts with segmentation for region of
interest (ROI) extraction, continues with preprocessing in various combinations, and ends with a
recognition step that measures the effectiveness of the preprocessing. As seen in the figure, this research
uses the preprocessed image, without any extracted features, as input: the image matrix produced by
preprocessing is sent directly to the classifier.


[Figure 1 flowchart: Input (raw Javanese script image) -> ROI extraction -> Preprocessing -> Classification -> Output (class of handwritten script)]


Figure 1. Research method 


2.1. Hanacaraka letters 

Javanese letters, better known as Hanacaraka, consist of 20 letters (as seen in Figure 2) from the
legend of Ajisaka. In the legend, there is a fight between two servants of Ajisaka, namely Dora and Sembada.
In the battle, both of them died because they were equally strong, as reflected in the meaning of
Hanacaraka in Figure 2. To commemorate his two servants, Ajisaka arranged the story of the two into an
ordered sequence of letters [6].


Ha Na Ca Ra Ka = ono wong loro (there are two people)
Da Ta Sa Wa La = podho kerengan (they both fight)
Pa Dha Ja Ya Nya = podho joyone (equally strong)
Ma Ga Ba Tha Nga = mergo dadi bathang lorone (hence both of them are dead; both die because they are equally strong)


Figure 2. Javanese letters [6] 


2.2. Hanacaraka letters dataset 

The Javanese script dataset comes from Kaggle under the name "aksara Jawa: aksara Jawa custom
dataset". It consists of 2,154 training images, with different numbers in each class, and 480 evaluation
images [35]. Table 1 shows the distribution of data in each class. Each image is an RGB image of 224x224
pixels. Figure 3 shows twenty-five examples of test images.


Table 1. Number of training and test images in each class
Class | Training images | Test images | Class | Training images | Test images
Ha | 102 | 24 | Pa | 108 | 24
Na | 108 | 24 | Dha | 108 | 24
Ca | 108 | 24 | Ja | 108 | 24
Ra | 108 | 24 | Ya | 108 | 24
Ka | 108 | 24 | Nya | 108 | 24
Da | 108 | 24 | Ma | 108 | 24
Ta | 108 | 24 | Ga | 108 | 24
Sa | 108 | 24 | Ba | 114 | 24
Wa | 108 | 24 | Tha | 108 | 24
La | 108 | 24 | Nga | 102 | 24




Figure 3. Sample of Hanacaraka letters 


2.3. Region of interest extraction 

The ROI extraction step removes unnecessary background so that the image tightly contains the
Hanacaraka character object. Javanese characters differ from the letters of the alphabet in that one
character can consist of two sub-images, such as the letter “nga” in Figure 4. Because “nga” consists of
two sub-image components, it produces two separate convex sub-regions, which is an obstacle for ROI
extraction. A step that merges the two sub-images is therefore needed before extracting the ROI.




Figure 4. Components of the letter “nga” 


Figure 5 shows the ROI extraction steps. First, the image is inverted so that the object has white
pixels and the background has black pixels. Second, a closing operation is applied using a circular
structuring element 15 pixels in diameter. Third, the extrema points are found using the regionprops
function in MATLAB to get the outer corners of the object. Eight outer corners can be extracted: top-left,
top-right, right-top, right-bottom, bottom-right, bottom-left, left-bottom, and left-top. The top-left
extrema point, for example, is not exactly the top-left corner of the bounding box of the object. The
fourth step therefore derives the four bounding box corners: top-left, top-right, bottom-right, and
bottom-left. After finding these four points, the Hanacaraka character object is extracted. Figure 6 gives
the algorithm for finding the four corners for ROI extraction of the Javanese character based on the
extrema points.


[Figure 5 panels: Original, Inverse, Closing, ROI extracted]


Figure 5. ROI extraction steps 


1. Find the extrema points
2. Find the minimum row index of the extrema point set and set it as minrow
3. Find the maximum row index of the extrema point set and set it as maxrow
4. Find the minimum column index of the extrema point set and set it as mincol
5. Find the maximum column index of the extrema point set and set it as maxcol
6. Get the ROI pixels (minrow to maxrow, mincol to maxcol)


Figure 6. ROI extraction algorithm 
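
The steps in Figure 6 can be sketched in Python as follows. This is a minimal illustration, not the MATLAB code used in the research: it assumes the image is already inverted and closed, is stored as a list of rows with 1 for object pixels and 0 for background, and it takes the minimum and maximum indices over all object pixels instead of over the eight regionprops extrema points (for a single merged object both give the same box). The function name `extract_roi` is hypothetical.

```python
def extract_roi(image):
    """Crop a binary image (list of rows, 1 = object, 0 = background)
    to the tightest box that contains every object pixel."""
    # Row and column indices that contain at least one object pixel.
    rows = [r for r, row in enumerate(image) if any(row)]
    cols = [c for c in range(len(image[0])) if any(row[c] for row in image)]
    if not rows:
        return []  # no object found in the image

    # Steps 2 to 5 of Figure 6: bounding indices.
    minrow, maxrow = min(rows), max(rows)
    mincol, maxcol = min(cols), max(cols)

    # Step 6: keep pixels from minrow to maxrow and mincol to maxcol.
    return [row[mincol:maxcol + 1] for row in image[minrow:maxrow + 1]]


if __name__ == "__main__":
    sample = [
        [0, 0, 0, 0, 0],
        [0, 0, 1, 0, 0],
        [0, 1, 1, 1, 0],
        [0, 0, 0, 1, 0],
        [0, 0, 0, 0, 0],
    ]
    for row in extract_roi(sample):
        print(row)
```
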


2.4. Preprocessing 

This research uses combinations of three preprocessing steps: dilation, skeletonization, and noise
reduction. The eight combinations are no preprocessing (A), dilation (B), skeletonization (C), noise
reduction (D), dilation and skeletonization (E), dilation and noise reduction (F), noise reduction and
skeletonization (G), and dilation, noise reduction, and skeletonization (H). Each combination is applied
to two background colors, white and black, coded “w” for white and “b” for black.
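
The sixteen experiment codes (eight combinations times two backgrounds) can be enumerated with a small sketch like the one below. The dictionary names and the `apply_pipeline` helper are hypothetical, and the order of steps inside each combination (dilation, then noise reduction, then skeletonization last) is an assumption based on the order in which the combinations are named.

```python
# Steps for each combination code; skeletonization is assumed to run last.
COMBINATIONS = {
    "A": [],
    "B": ["dilation"],
    "C": ["skeletonization"],
    "D": ["noise_reduction"],
    "E": ["dilation", "skeletonization"],
    "F": ["dilation", "noise_reduction"],
    "G": ["noise_reduction", "skeletonization"],
    "H": ["dilation", "noise_reduction", "skeletonization"],
}
BACKGROUNDS = {"w": "white", "b": "black"}


def experiment_codes():
    """Return all codes such as 'Aw' or 'Hb', white background first."""
    return [code + bg for bg in BACKGROUNDS for code in COMBINATIONS]


def apply_pipeline(image, code, operations):
    """Apply the steps of one code in order.

    operations maps a step name to a function image -> image; the caller
    supplies the actual implementations.
    """
    for step in COMBINATIONS[code[0]]:
        image = operations[step](image)
    return image


if __name__ == "__main__":
    print(experiment_codes())
```
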


2.4.1. Dilation 

Dilation is a morphological operation that thickens the object or closes small gaps by adding
pixels around the existing object [36]. Figure 7 shows an example of applying dilation to a binary image
with a black background. The dilated image shows a thickening of the object's pixels according to the size
of the structuring element.


Figure 7. Dilation using a disk structuring element: (left) original image, (middle) disk=1, (right) disk=3
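

As an illustration of how dilation thickens an object, the sketch below implements binary dilation in plain Python. It assumes a black background (object pixels are 1) and builds the disk from a Euclidean distance test, which may differ slightly from the disk approximation used by an image processing library at larger radii. The function name `dilate` and its `radius` parameter are hypothetical.

```python
def dilate(image, radius=1):
    """Binary dilation with a disk-shaped structuring element.

    image: list of rows, 1 = object (white), 0 = background (black).
    """
    height, width = len(image), len(image[0])
    # Offsets (dr, dc) that lie inside a disk of the given radius.
    offsets = [
        (dr, dc)
        for dr in range(-radius, radius + 1)
        for dc in range(-radius, radius + 1)
        if dr * dr + dc * dc <= radius * radius
    ]
    out = [[0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            if image[r][c]:
                # Stamp the disk around every object pixel.
                for dr, dc in offsets:
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < height and 0 <= cc < width:
                        out[rr][cc] = 1
    return out


if __name__ == "__main__":
    dot = [[0] * 5 for _ in range(5)]
    dot[2][2] = 1
    for row in dilate(dot, radius=1):
        print(row)
```

A larger radius adds more pixels around each stroke, which is consistent with the thicker result shown for disk=3 compared with disk=1.
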


2.4.2. Skeletonization 

The next step, skeletonization, is also a morphological algorithm; it aims to obtain a skeleton from
the shape of the image object [36]. Skeletonization reduces the object's pixels until the object forms a
line one pixel thick, using successive erosions of the object combined with opening operations (see
Figure 8). Like other morphological techniques, skeletonization also uses a kernel to perform its function.


2.4.3. Noise reduction (denoising)

Denoising aims to remove noise from the image. Noise is a small set of pixels isolated from other
parts of the object. The denoising method is a filtering method that uses kernel operators. Applying it
blurs the image by mixing the intensity of each pixel with that of its neighboring pixels. This research
uses a Gaussian kernel to eliminate the noise. In the spatial domain, this operation is usually called
filtering or convolution. For the 3x3 kernel w in Figure 9, the output image g(x, y) of spatial filtering
at point (x, y) of the input image f(x, y) is:

$$g(x, y) = \sum_{s=-1}^{1} \sum_{t=-1}^{1} w(s, t)\, f(x + s, y + t) \tag{1}$$


Figure 8. Result of skeletonization


Figure 9. Sliding window spatial filtering [37]
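

Equation (1) can be written directly as a pair of nested loops. The sketch below is only illustrative: the kernel values (a common 3x3 binomial approximation of a Gaussian) and the zero padding at the image border are assumptions, since the source gives neither the kernel weights nor the border handling. The names `filter3x3` and `GAUSSIAN_3X3` are hypothetical.

```python
# Assumed 3x3 Gaussian approximation; the weights sum to 1.
GAUSSIAN_3X3 = [
    [1 / 16, 2 / 16, 1 / 16],
    [2 / 16, 4 / 16, 2 / 16],
    [1 / 16, 2 / 16, 1 / 16],
]


def filter3x3(f, w):
    """Spatial filtering following equation (1):
    g(x, y) = sum over s, t in {-1, 0, 1} of w(s, t) * f(x + s, y + t).

    Pixels outside the image are treated as 0 (zero padding, an assumption).
    """
    height, width = len(f), len(f[0])
    g = [[0.0] * width for _ in range(height)]
    for x in range(height):
        for y in range(width):
            total = 0.0
            for s in (-1, 0, 1):
                for t in (-1, 0, 1):
                    xs, yt = x + s, y + t
                    if 0 <= xs < height and 0 <= yt < width:
                        # w is stored with index 0..2, so shift s and t by 1.
                        total += w[s + 1][t + 1] * f[xs][yt]
            g[x][y] = total
    return g


if __name__ == "__main__":
    impulse = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
    for row in filter3x3(impulse, GAUSSIAN_3X3):
        print(row)
```

Filtering a single isolated pixel spreads its intensity over the 3x3 neighborhood, which is the blurring effect described above.
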


2.5. Artificial neural network classifier 

ANNs are artificial representations of the human brain that simulate its learning process. An ANN
consists of interconnected neurons, joined by weighted connections, that process input data into output
data. Backpropagation, the ANN learning method, is a supervised learning algorithm for multilayer
networks that adjusts the weights associated with neurons in the hidden layers using the expected output,
which is known beforehand [38].

The neural network (NN) structure in this research has 20 neurons in the output layer, matching the
number of Javanese characters used: the 20 basic (carakan) characters. Each output neuron has a value of 0
or 1. For example, if the result is the character “ha,” the first neuron is 1 while the other neurons are
0. For the second character, “na,” the second neuron is 1 while the other neurons are 0, and so on. The
input layer has nxn neurons, matching the image size, because no feature extraction method is applied and
the raw pixel data is used as input.
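
The output encoding described above is a one-hot vector. A minimal sketch is shown below; the class order follows the Hanacaraka sequence in Figure 2, while the helper names `one_hot` and `decode` are hypothetical.

```python
# The 20 carakan classes in Hanacaraka order.
CLASSES = [
    "ha", "na", "ca", "ra", "ka",
    "da", "ta", "sa", "wa", "la",
    "pa", "dha", "ja", "ya", "nya",
    "ma", "ga", "ba", "tha", "nga",
]


def one_hot(label):
    """Return a 20-element target vector with a single 1 at the class index."""
    index = CLASSES.index(label)
    return [1 if i == index else 0 for i in range(len(CLASSES))]


def decode(outputs):
    """Map network outputs back to a label by picking the largest value."""
    best = max(range(len(outputs)), key=outputs.__getitem__)
    return CLASSES[best]


if __name__ == "__main__":
    print(one_hot("ha"))
    print(one_hot("na"))
    print(decode(one_hot("nga")))
```
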

This research uses an ANN to measure the effectiveness of each preprocessing step through
classification performance. The experiment varies the number of hidden layers and hidden nodes. There are
three ANN architectures: three hidden layers with 32, 64, and 128 nodes; five hidden layers with 32, 32,
64, 64, and 128 nodes; and seven hidden layers with 32, 32, 64, 64, 128, 128, and 1,024 nodes. The other
ANN parameters are 4,096 input nodes (a 64x64 matrix) and 20 output nodes representing the Javanese
character classes.
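
To give a sense of the relative size of the three architectures, the sketch below counts trainable parameters. It assumes fully connected layers with one bias per node, which the source does not state explicitly; the names `ARCHITECTURES` and `count_parameters` are hypothetical.

```python
INPUT_NODES = 64 * 64   # 4,096 input nodes
OUTPUT_NODES = 20       # one per Javanese character class

# Hidden layer sizes of the three architectures described in the text.
ARCHITECTURES = {
    "3 hidden layers": [32, 64, 128],
    "5 hidden layers": [32, 32, 64, 64, 128],
    "7 hidden layers": [32, 32, 64, 64, 128, 128, 1024],
}


def count_parameters(hidden_layers):
    """Count weights and biases, assuming dense layers with biases."""
    sizes = [INPUT_NODES] + list(hidden_layers) + [OUTPUT_NODES]
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


if __name__ == "__main__":
    for name, hidden in ARCHITECTURES.items():
        print(name, count_parameters(hidden))
```

Under these assumptions, most parameters of the two smaller networks sit in the first layer, because the 4,096 raw pixel inputs connect to every node of the first hidden layer.
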


3. RESULTS AND DISCUSSION 

Figure 10 displays the results of applying each preprocessing code. The first row uses a white
background input image, while the second uses a black background. The mean squared error (MSE) measures
the difference between the original image (A) and each preprocessed image (B to H). Dw and Db have zero
MSE, meaning that noise reduction did not change the image and the original image does not contain any
noise. Dilation, Bw and Bb, has the largest MSE because it makes thin lines thicker. Fw and Fb show the
effect of combining noise reduction and dilation: because noise reduction does not change the image, only
dilation has an effect, so the MSE is the same as for Bw and Bb. Codes C, E, G, and H involve
skeletonization, applied after dilation and/or noise reduction where these are present.
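
The MSE comparison can be sketched as below. This assumes both images have the same size and that pixel values are scaled to the range 0 to 1; for binary images under that assumption, the MSE equals the fraction of pixels that changed. The function name `mse` is hypothetical.

```python
def mse(original, processed):
    """Mean squared error between two equally sized images (lists of rows)."""
    n_pixels = len(original) * len(original[0])
    squared_error = sum(
        (p - q) ** 2
        for row_a, row_b in zip(original, processed)
        for p, q in zip(row_a, row_b)
    )
    return squared_error / n_pixels


if __name__ == "__main__":
    a = [[0, 1], [1, 0]]
    b = [[1, 1], [1, 0]]
    print(mse(a, a))  # identical images
    print(mse(a, b))  # one of four pixels differs
```
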

Table 2 shows the accuracy of each preprocessing combination for each ANN configuration. With a
single preprocessing step, dilation gives the best performance on both white and black backgrounds. With
two preprocessing steps, the combination of dilation and noise reduction provides the best performance.
The highest accuracy, 98%, is achieved using all three preprocessing steps on the black background with
the seven-hidden-layer ANN architecture. The experiment shows that the black background performs well
across all preprocessing combinations, whereas on the white background adding skeletonization makes the
performance drop. It can be concluded that the black background outperforms the white one because of the
skeletonization process: the skeletonization algorithm operates on white pixels to find the skeleton of an
object, and with a black background the object consists of white pixels.
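

[Figure 10 panels: preprocessing results for codes A to H on white (w) and black (b) backgrounds, with the MSE of each result relative to the original image A]
Code | B | C | D | E | F | G | H
MSE, white background | 0.1225 | 0.0806 | 0 | 0.0104 | 0.1225 | 0.0771 | 0.0104
MSE, black background | 0.1225 | 0.0806 | 0 | 0.0834 | 0.1225 | 0.0834 | 0.0834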


Figure 10. Preprocessing results 


Table 2. Testing accuracy (%) using various numbers of hidden layers and hidden nodes
Code | Preprocessing method | 3 hidden layers (32, 64, 128) | 5 hidden layers (32, 32, 64, 64, 128) | 7 hidden layers (32, 32, 64, 64, 128, 128, 1024)
Aw | White background | 69 | 55 | 67
Bw | White background and dilation | 88 | 86 | 87
Cw | White background and skeletonization | 5 | 8 | 29
Dw | White background and noise reduction | 64 | 60 | 48
Ew | White background, dilation, and skeletonization | 75 | 69 | 60
Fw | White background, dilation, and noise reduction | 88 | 80 | 85
Gw | White background, noise reduction, and skeletonization | 21 | 10 | 27
Hw | White background, dilation, noise reduction, and skeletonization | 62 | 65 | 68
Ab | Black background | 95 | 92 | 97
Bb | Black background and dilation | 95 | 93 | 95
Cb | Black background and skeletonization | 92 | 80 | 94
Db | Black background and noise reduction | 95 | 92 | 96
Eb | Black background, dilation, and skeletonization | 90 | 82 | 95
Fb | Black background, dilation, and noise reduction | 96 | 94 | 95
Gb | Black background, noise reduction, and skeletonization | 89 | 78 | 91
Hb | Black background, dilation, noise reduction, and skeletonization | 91 | 86 | 98
Average | | 76 | 71 | 77


4. CONCLUSION 

In this research, three preprocessing methods (dilation, skeletonization, and noise reduction) are
applied to two background colors (white and black). In the preprocessing experiment measured by MSE, noise
reduction does not change the image much, so it can be omitted as a preprocessing step. Dilation makes the
largest difference in MSE because it thickens the strokes, whereas skeletonization makes a smaller
difference because it thins them. The best ANN architecture has seven hidden layers with 32, 32, 64, 64,
128, 128, and 1,024 nodes.

The experiment shows that the preprocessing methods provide maximum accuracy when used together. A
black background image is recommended over a white background because it gives better performance. The
noise reduction method can be omitted if the image is free of noise. In future research, other
preprocessing methods can be added, and feature extraction should also be applied to reduce the number of
features processed by the machine learning model. Candidate feature extraction methods include LBP,
gray-level co-occurrence matrix (GLCM), projection profile, and HOG.